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Abstract 

This paper shows that the Levenberg-Marquardt Algorithms (LMA) algorithms can be merged into the Gauss Newton Filters (GNF) 
to track difficult, non-linear trajectories, without divergence. The GNF discusssed in this paper is an iterative filter with memory 
that was introduced by Norman Morrison 1 1]. The filter uses back propagation of the predicted state to compute the Jacobian 
matrix over the filter memory length. The LMA are optimisation techniques widely used for data fitting |2]. These optimisation 
techniques are iterative and guarantee local convergence. We also show through simulation studies that this filter performance is 
not affected by the process noise whose knowledge is central to the family of Kalman filters. 
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1. Introduction 



This paper shows that the Levenberg-Marquardt Algorithms (LMA) can be merged into the Gauss Newton Filters 
5h ' (GNF) to track difficult, non-linear trajectories, without divergence . 1 3.^ 



for initialising tracking filters [7, 



4|, IS la] • In the past, the LMA has been used 
In this paper we show that the LMA can be merged into the flexible GNF 
filters to produce a hybrid formulation with very powerful convergence properties even in highly non-linear input data 
situations. The hybrid filter is also self initialising. 

These optimisation techniques are iterative 



The LMA are optimisation techniques widely used for data fitting 1 3] 

and guarantee convergence in a specified region i.e. they do necessarily produce global minimal 10]. They are also 

used in most neural networks algorithm Jl^ll2[[l3i . 

The Gauss Newton filter (GNF) discussed in this paper was introduced by Morrison! 1 ] to tracking and smoothing 

at about the same period as the Kalman filter, but it received little attention due to its computational requirements, 
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problematic for the limited computers of the time. The GNF is iterative and non-recursive, with memory that can be 
adaptively controlled. The GNF differs from the Gauss-Newton optimisation methods discussed in the literature as 
it provides a different method for computing the Hessian matrix 111411 . This flexibility makes the GNF filter highly 
suitable for tracking in strongly non-linear situations. 

In this paper we adapt the GNF to the LMA method (which we call the Morrison LMA Filter) and we state that 
this filter can be used for radar target tracking without the risk of divergence. We also show through simulation studies 
that this filter performance is not affected by the process noise whose knowledge is central to the family of Kalman 
filters. The literature on the use of LMA as a tracking algorithm are rare, possibly due to lack of exposure to Morrison 
approach in the GNF. 

The LMA is well known as an aid for track initiation 
as an initiation tool in our hybrid filter, but rather as an integral part of the filter. The paper starts in Section[2]to define 
a state space model based on nonlinear differential equations. 

Section[3]is important as it describes the incorporation of the LMA methods into the GNF to produce the Morrison 
LMA, which converges very robustly . The performance of the new filters is demonstrated in a series of simulations 
described in Section |4] The paper concludes with a summary and indication of future work. 

2. State space model based on nonlinear differential equations 

Consider the following autonomous, nonlinear differential equation (DE) governing the process state: 

DX(t) = F(X(t)) (1) 

in which F is a non linear vector function of the state vector X describing a process, such as the position of a target 
in space. We assume the observation scheme of the process is a nonlinear function of the process state with expression 



. We make it clear here that the LMA is not applied 



Y(t) = G(X(t)) + v(f) (2) 

where G is a nonlinear function of X and v(f) is a random Gaussian vector. The goal is to estimate the process 
state from the given state nonlinear models. For linear DEs, the state transition matrix could be easily obtained. This, 
however, is not the case with a nonlinear DEs. Nevertheless, there is a procedure, based on local linearisation, that 
enables us to get around this obstacle, which we will now present. 

2.1. The method of local linearisation 

The solution of the DE gives rise to infinitely many trajectories that are dependent on the initial condition. However 

there will be one trajectory whose state vector the filter will attempt to identify from the observations. We assume that 

there is a known nominal trajectory with state vector X(t) that has the following properties: 

2 



/ Digital Signal Processing 00 (2011) 7-?? 3 

• X(t) satisfies the same DE as X(t) 

• X(t) is close to X(t) 

The above-mentioned properties result in the following expression:^. 

X(i)=X(i) + 6X(i) (3) 

where 5X(t) is a vector of time-dependent functions that are small in relation to the corresponding elements of 
either X(t) or X(t), The vector SX(t) is called the perturbation vector and is governed by the following DE (The 
derivation is shown in Appendix A): 

D(6X(t)) = A(X(t))6X(t) (4) 
where A(X(t)) is called a sensitivity matrix defined as follows: 



(5) 

X(f) 



d(X(t)) 

This equation is therefore a linear DE, with a time varying coefficient and has a the following transition equation: 

6X(t + = ®(?n + £, t„,X)6X(t)subsec : local (6) 



in which 3>(Y„ + t n ,X) is the transition matrix from time t„ to t„ + £ (increment £). The transition matrix is 
governed by the following DE: 



— <b{t nH , t n , x) = A(X(t n + omt nH , t n , X) (7) 

<D(f„,f„,l) = / (8) 

The transition matrix is a function of X(t) and can be evaluated by numerical integration and in order to fill the 
values of A(X(t n + 0), X(t) has to be integrated numerically. 

2.2. The observation perturbation vector 

In this section we will adopt the notation X„ and Y„ for X(t n ) and Y(t n ) respectively. We define a simulated noise 
free observation vector Y„ as follows: 

Y„ = G(X n ) (9) 

Subtracting Y„ from the actual observation Y„ gives the observation perturbation vector: 
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6Y„ = Y„ - Y„ 



(10) 



In Appendix A we show that the observation perturbation vector is related to the state perturbation vector as 
follows: 



6Y„ = M(X„)6X n + v„ 



(11) 



where M(X„) is the Jacobean matrix of G, evaluated at X„. The matrix is also called the observation sensitivity 
matrix and is defined as follows: 



M(X n ) = 



dF(X„) 



d(X n ) 



(12) 



We now examine the sequence of observations. 



2.3. Sequence of observation 

We assume that L + 1 observation are obtained with time stamps t„, t„-\, f„_£,Theses observations are assembled 
as follows : 



6Y n 




M(X„)5X n 




Vn 


SY n ^ 




M(X„^)6X„^ 


+ 


Vn-1 


5Y n - L 




M(X n . L )dX n . L 




Vn-L 



Using the relationship: 

6X m = <b{t m ,t n ,X)5X n 
then, substituting Equation[13]the observation sensitivity equation can be written as 

6Y„ = T n 6X n + V„ 

in which T„, the total observation matrix is defined as follows: 



(13) 



(14) 



(15) 



M(X„) 
M(X„^(t n ^,t n ;X) 

T„ = 



M(X„_ L )0(f„_ L ,?„;X) 
4 



(16) 
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The vectors 5Y„ and V„ are large. The cost function we would want to minimize is : 
efE(5X n ) = (tfY„ - \ n fK;\5Y n - V„) = 5X T n {T T n YC n l T n )6X n 
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(17) 



The solution that minimises the cost function can be obtained from the minimum variance estimation as follows: 

5X n = (TX-'^rXR,; 1 ^ (18) 

The estimate 6X„ has a covariance matrix: 



S n - (T n R„ T„) 1 



(19) 



where R ;l 1 is a block diagonal weight matrix, also called the least squares weight matrix, but, in fact, if we define R n 
as the covariance matrix of the the error vector v„, then R" 1 is expressed as: 



r: 



R7, 















o K-l 



(20) 



In this section we arrived at a form of filter that uses the minimum variance estimation method, initiated by 
Gauss in "Theoria Combinationis Observationum Erroribus Minimis Obnoxiae," and the local linearisation technique 
championed by Newton to estimate the state of the process from the non linear observation scheme. This filter is 
called Gauss-Newton filter (GNF) and is described in detail in Morrison's work yj]. 



3. Adaptation to Levenberg Marquard 

This section represents the key step in the development of the Morrison LMA Filter. For simplicity in adaptation 
of the GNF to the Levenberg Marquard method we assume the dynamic of of the process we want to track is governed 
by linear differential equations and the observation scheme is non linear. The process transition equation will be: 



X n+ , = ®(s)X n (21) 

The GNF will fail to converge if the matrix (T^R^T,,)- 1 is singular. By definition this matrix is positive definite 
However, it can loose this property due to numerical inaccuracy or high non-linearity. To avoid the singularity, a 
damping factor is introduced in equation as follows: 
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5X n = (T^R^T,, +fiiy l T T n R; l 5Y„ 



(22) 



which is the form suggested by Levenberg and Marquardt [ 
The effect of the damping factor is as follows: 

• For all positive fi the matrix (T^R^'Tn + is positive definite, ensuring that 6X is in the descent direction; 

• When is large we have: 



6X = -TlR'JdYn 



(23) 



The algorithm behaves as a steepest descent which is ideal when the current solution is far from the local minimum. 
The convergence will be slow but however guaranteed. When p is small, the algorithm has faster convergence and 
behaves like the Gauss-Newton. 

The damping factor can be updated by the gain ratio: 

c?Y„ r R-'c?Y„ - (Y„ - Y) r R„-'(Y„ - Y) 

Q = (24) 

E(6X n ) 

where <5Y„ is the long vector of L sequences of obserservation including the current observation. 
Y is the long error free observation computed by back propagation of the current iterate X new . If we sample at 
constant rate q then: 



G(X new ) 
G(Q>(-s)X mw ) 



(25) 



G{<b{-{L-\)s)X new ) 

The numerator is the actual computed gain and the denominator is the predicted gain. Recalling equation [P71 and 
replacing 6X n by the expression in equation[18]we have: 



E(6X„) = 6XT(TlR- n l T„){Tj,R- l T„ + ijir l TlK l W n 



(26) 



which reduces to: 



E{5X„) = SX T JJ T n R- n x S\„ + fi6X„) (27) 

A large value of q indicates that E(SX n ) is a good approximation of Y, and yi can be decreased so that the next 
Levenberg-Marquardt step is closer to the Gauss-Newton step. If q is small or negative then E(SX„) is a poor approx- 
imation, then /i should be increased to move closer to the steepest descent direction. The algorithm adapted from J15I 
is presented as follows: 

subsec:local 
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Algorithm 1 L-M algorithm for tracking system 
k := 0,v := 2,X := X„- Un 

A := T^R^Tn^Yn := Y„ - Y„; g := TjR^JY,,; Y„ is computed using X 
stop := false;ii — t * max(diag(A)); 
While (not stop) and (& < £ max ) 

fc := fc+ 1 

repeat 

solve (A + nI)5X n = g 
if(||tf* B ||<c||X||) 

stop:=true; 
else 

X ne w '■— X + 5X n ; 

q = [6Y T n R- n l 6Y„ - (Y„ - Y) r R„- 1 (Y„ - Y)]l[5X T n {g + //&£„)]; Y evaluated at X, 
ifg > 

X — X new , 

A := TlR-'TnMr, := Y„ - Y„; g := TjR^tfY,,; Y„ is computed using X 
H=n* max(l/3, 1 - (2g + l) 3 );v := 2; 

else 

/i := v * //; 
v := 2 * v; 

endif 
endif 
until(p > O)or(stop); 
endwhile 

X n / n — X; 

X n/n+1 = ®(s)X n/n 
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4. Simulations 
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In the simulation studies we adopt multiple target dynamics: 

• Case 1 : The target is moving at constant velocity under the process noise of constant standard deviation. 

• Case 2 : The target moving at constant velocity with process noise standard deviation varying 

In all the cases the observation scheme is non linear. The observables are range p,bearing 0,elevation 9 and Doppler 
fd- The observation equation is therefore defined as follows: 



yjx 2 +y 2 + z 2 
tan~ l (y/x) 
tanT x {zj V* 2 + y 2 ) 



K d 



xx+yy+zz 



+ v(0 



(28) 



R 



where v(r) is vector of random variables with covariance 

60 2 

0.001 2 

0.001 2 

2 2 

throughout the simulations. Kj = -2n/A = -200. The constants r = 10 _1 , s = 10~ 20 , k max = 200, £ = Is are used in 
all the cases. 



4.1. Casel 

In this example, we seek to demonstrate that the filter does not diverge in the presence of constant variance process 
which is unknown to its model. : The target state vector X = [x, x,y, y,z,z] T is defined by the following transition 
equation: 

r 1 $ 
1 
1 <r 
1 

1 ? 
1 

where a\,a\,a\ are independent, Gaussian random variables, with standard deviation cr = 0.001. The state vector 
is used in to generate measurements for the simulation. The filter, however, does not depend on the process noise, it 
assumes the target is moving at constant speed without process noise. 

8 



X„+i - 





\a x q 2 




a\q 




\a 2 q 2 










ifl 3? 2 




a 3 g 



(29) 
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The initial value of the state vector is X = [800,25, 1000, -25,400, 14]. Two thousand samples are generated 
and the process is repeated 50 times. The position root mean squared error (RMSE) after the 50 Monte Carlo runs is 
presented in Figure [1]. The position RMSE is computed as follows : 



where {x l n ,y l n ,t^} and (x n ,y„,z n ) true and estimated position coordinates respectively. 

We see from Figure [1] that there is no divergence in position despite the presence of the process noise, which 
is unknown to the filter. The filter with the smallest memory exhibits the largest RMSE. The average number of 
iterations is presented in Fgure [2]. All the filters have about the same value of k, which is around 34, meaning the 
computation time of the algorithm is primarily dependent on the computation of the T„ matrix. Therefore if we want to 
reduce the computation time of the algorithm, we would choose a small memory length (the T„ matrix will be small 
and hence less computation), but this would result in less accuracy in the estimates. 



Here we show the effect of higher variation in the process noise on the filter performance. In this case the target 
dynamic model is the same as in Case 1 . The standard deviation(cr) of the process noise is varied. From sample to 
200 cr = 0.001, between samples 201 and 260 <x = 0.05 and finally from sample 261 to 400 <x = 0.001. The position 
RMSE after 200 Montecarlo runs is shown in Figure [3]. All the filters reset to the original RMSE when the process 
noise standard deviation returned to the former value. The RMSE of filter with the smallest memory length is less 
affected by these changes. However the number of iterations during high disturbance is higher for the smaller memory 
length filter (Figure [4]). Theses results highlight the adaptiveness of the algorithm to disturbance. They also give a 
hint about the ability of the filter to track manoeuvre. This will be the subject of our next pubiblication. 

5. Conclusion 

This paper introduced the standard Gauss Newton filter that uses the back propagation of the predicted state vector 
over a finite memory length to compute the Jacobian matrix. It then computes the current estimate of the state vector 
through the minimum variance estimation. The Gauss Newton was then adapted to the Levenberg and Marquard 
method to guarantee its convergence all the time. 

The adapted algorithm was used in simulations to track targets in Cartesian coordinates when the observations 
consist of range, bearing , elevation and Doppler. The results highlight the robustness of the new, Morrison LMA filter 
which can withstand strong random disturbance and nonlinear trajectories. We observed from simulations studies that 
by adaptively changing the memory length, the filter will be able to track maneuvres. Such memory control algorithm 
will be a subject of our next publication. 




(30) 



4.2. Case 2 
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Appendix A. 
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Appendix A.l. The differential equation governing 5X(t) 
Starting from : 

SX(t) = X(t) - X(t) 

The differentiation rule is applied: 



Let F be defined as follows : 



D6X(t) = F(X(t) + SX(t)) - F(X(t)) 



fx 



F = 



Equation becomes : 



D6X(t) = 



fn 



MX(t) + 6X(t)) MX(t)) 



f n (X(t) + 5X{t)) f n (X(t)) 



The Taylor first order approximation is applied: 





MX(t)) 




V/i(X(f)) r 




DSX(t) = 




+ 




SX(t) 




fn(X(t)) 




v/„(X(0) r 





MX(t)) 



fn(X(t)) 



The following relation is obtained : 



D6X(t) = A(X(t))6X(t) 
10 



Where: 



/ Digital Signal Processing 00 (2011) 7-?? 
V/!(l(f)) r 



11 



A(X(i)) = 



dF(X(t)) 



(A.7) 



d(X(t)) m 



Vfn(X(t)) 



Appendix A.2. The relation between 6X n and 6Y, 



n 



6Y n = G(X n + 6X„) - G(X„) 



(A.8) 



As direct consequence of Appendix A. 1 the following relationship is obtained: 



6Y„ = M(X„)6X„ + v, 



(A.9) 
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